287 research outputs found
DutchHatTrick: semantic query modeling, ConText, section detection, and match score maximization
This report discusses the collaborative work of the ErasmusMC, University of Twente, and the University of Amsterdam on the TREC 2011 Medical track. Here, the task is to retrieve patient visits from the University of Pittsburgh NLP Repository for 35 topics. The repository consists of 101,711 patient reports, and a patient visit was recorded in one or more reports.
Associative conceptual space-based information retrieval systems
In this 'Information Era' with the availability of large collections of books, articles, journals, CD-ROMs, video films and so on, there exists an increasing need for intelligent information retrieval systems that enable users to find the information desired easily. Many attempts have been made to construct such retrieval systems, including the electronic ones used in libraries and including the search engines for the World Wide Web. In many cases, however, the so-called 'precision' and 'recall' of these systems leave much to be desired.
In this paper, a new AI-based retrieval system is proposed, inspired by, among other things, the WEBSOM algorithm. However, contrary to that approach, where domain knowledge is extracted from the full text of all books, we propose a system where certain specific meta-information is automatically assembled using only the index of every document. This knowledge extraction process results in a new type of concept space, the so-called Associative Conceptual Space, where the 'concepts' found in all documents are clustered using a Hebbian-type learning algorithm. Then, each document can be characterised by comparing the concepts occurring in it to those present in the associative conceptual space. Applying these characterisations, all documents can be clustered such that semantically similar documents lie close together on a Self-Organising Map. This map can easily be inspected by its user.
Efficient GPU-accelerated fitting of observational health-scaled stratified and time-varying Cox models
The Cox proportional hazards model stands as a widely-used semi-parametric
approach for survival analysis in medical research and many other fields.
Numerous extensions of the Cox model have further expanded its versatility.
Statistical computing challenges arise, however, when applying many of these
extensions with the increasing complexity and volume of modern observational
health datasets. To address these challenges, we demonstrate how to employ
massive parallelization through graphics processing units (GPU) to enhance the
scalability of the stratified Cox model, the Cox model with time-varying
covariates, and the Cox model with time-varying coefficients. First, we
establish how the Cox model with time-varying coefficients can be transformed
into the Cox model with time-varying covariates when using discrete
time-to-event data. We then demonstrate how to recast both of these into a
stratified Cox model and identify their shared computational bottleneck that
results when evaluating the now segmented partial likelihood and its gradient
with respect to regression coefficients at scale. These computations mirror a
highly transformed segmented scan operation. While this bottleneck is not an
immediately obvious target for multi-core parallelization, we convert it into
an un-segmented operation to leverage the efficient many-core parallel scan
algorithm. Our massively parallel implementation significantly accelerates
model fitting on large-scale and high-dimensional Cox models with
stratification or time-varying effects, delivering an order-of-magnitude speedup
over traditional central processing unit-based implementations.
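The conversion of a segmented scan into an un-segmented one, mentioned in the abstract above, can be illustrated with a small sketch. It is not the authors' GPU implementation: it assumes the classic trick of scanning (segment-start flag, value) pairs with an associative operator, and it uses Python's sequential `itertools.accumulate` as a stand-in for a parallel scan. The names `combine` and `segmented_cumsum` are hypothetical.

```python
from itertools import accumulate


def combine(left, right):
    """Associative operator on (segment_start_flag, value) pairs.

    If the right element starts a new segment, the running sum restarts;
    otherwise the right value is added to the running sum on the left.
    """
    left_flag, left_val = left
    right_flag, right_val = right
    if right_flag:
        return (True, right_val)
    return (left_flag, left_val + right_val)


def segmented_cumsum(values, starts):
    """Cumulative sum that restarts wherever starts[i] is True.

    A single un-segmented scan over (flag, value) pairs replaces one
    scan per segment (for example, one per stratum).
    """
    pairs = zip(starts, values)
    return [val for _, val in accumulate(pairs, combine)]


# Two segments (e.g. two strata): [1, 2] and [3, 4, 5].
values = [1.0, 2.0, 3.0, 4.0, 5.0]
starts = [True, False, True, False, False]
print(segmented_cumsum(values, starts))  # [1.0, 3.0, 3.0, 7.0, 12.0]
```

Because `combine` is associative, the same operator could in principle be handed to a many-core parallel scan primitive; the sequential loop here only demonstrates that the result matches a per-segment cumulative sum.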
Massive Parallelization of Massive Sample-size Survival Analysis
Large-scale observational health databases are increasingly popular for
conducting comparative effectiveness and safety studies of medical products.
However, the increasing number of patients poses computational challenges when
fitting survival regression models in such studies. In this paper, we use
graphics processing units (GPUs) to parallelize the computational bottlenecks
of massive sample-size survival analyses. Specifically, we develop and apply
time- and memory-efficient single-pass parallel scan algorithms for Cox
proportional hazards models and forward-backward parallel scan algorithms for
Fine-Gray models for analysis with and without a competing risk using a cyclic
coordinate descent optimization approach. We demonstrate that GPUs accelerate
the computation of fitting these complex models in large databases by
orders of magnitude compared to traditional multi-core CPU parallelism. Our
implementation enables efficient large-scale observational studies involving
millions of patients and thousands of patient characteristics.
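To see why a scan appears in Cox model fitting at all, consider the following minimal sketch. It assumes no tied event times and a precomputed linear predictor per patient, and it runs sequentially on the CPU, so it illustrates the role of the scan rather than the GPU implementation described above. The function name `cox_log_partial_likelihood` is hypothetical.

```python
import math
from itertools import accumulate


def cox_log_partial_likelihood(times, events, linear_predictors):
    """Cox log partial likelihood, assuming no tied event times.

    After sorting subjects by decreasing time, the risk-set sum for each
    subject is a prefix sum (scan) of exp(linear predictor).
    """
    # Indices ordered from the longest to the shortest follow-up time.
    order = sorted(range(len(times)), key=lambda i: -times[i])
    exp_eta = [math.exp(linear_predictors[i]) for i in order]
    # The scan: risk_sums[pos] sums exp(eta) over everyone still at risk.
    risk_sums = list(accumulate(exp_eta))
    total = 0.0
    for pos, i in enumerate(order):
        if events[i]:  # censored subjects contribute only to risk sets
            total += linear_predictors[i] - math.log(risk_sums[pos])
    return total


# Three subjects, all with events, all linear predictors zero:
# the risk sets have sizes 1, 2, 3, so the result is -log(1 * 2 * 3).
value = cox_log_partial_likelihood([3.0, 2.0, 1.0], [1, 1, 1], [0.0, 0.0, 0.0])
print(value)
```

The `accumulate` call is the step that a parallel scan would replace; the rest of the computation is independent per subject apart from the final sum.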